Add CDC support for mixed index mutations via Kafka by Copilot · Pull Request #4874 · JanusGraph/janusgraph

Copilot · 2025-10-29T15:00:40Z

Mixed indexes risk permanent inconsistency when index writes fail but graph mutations succeed. This implements CDC (Change Data Capture) to ensure eventual consistency by publishing index mutations to Kafka for asynchronous processing.

Architecture

New janusgraph-cdc module provides:

Event Model - CdcMutationEvent serializes index mutations (additions/deletions/metadata) as JSON
Producer - KafkaCdcProducer publishes events to Kafka with configurable reliability
Worker - CdcWorker consumes events and applies mutations to IndexProvider
Transaction Wrapper - CdcIndexTransaction intercepts mutations before commit
Configuration - New config namespace index.<name>.cdc.*

CDC Modes

dual (default) - Write to index AND publish to Kafka for redundancy
skip - Only publish to Kafka, skip direct index writes
cdc-only - Alias for skip

Configuration

index.search.cdc.enabled=true
index.search.cdc.mode=dual
index.search.cdc.kafka-bootstrap-servers=localhost:9092
index.search.cdc.kafka-topic=janusgraph-index-cdc

Usage

CDC workers run as separate processes:

CdcWorker worker = new CdcWorker(
    kafkaServers,
    topicName,
    consumerGroupId,
    indexProvider,
    indexRetriever
);
worker.start();

Index transactions can be wrapped programmatically:

CdcIndexTransactionFactory factory = new CdcIndexTransactionFactory(config);
CdcIndexTransaction tx = factory.wrapIfEnabled(baseIndexTx);
// Mutations are captured and published to Kafka

Future Work

Integration with Backend (requires core changes to wrap IndexTransaction creation)
End-to-end test with Cassandra + Kafka + ElasticSearch (complex test infrastructure)
Metrics/monitoring integration

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

packages.confluent.io
- Triggering command: /usr/lib/jvm/temurin-17-jdk-amd64/bin/java --enable-native-access=ALL-UNNAMED -classpath /usr/share/apache-maven-3.9.11/boot/plexus-classworlds-2.9.0.jar -Dclassworlds.conf=/usr/share/apache-maven-3.9.11/bin/m2.conf -Dmaven.home=/usr/share/apache-maven-3.9.11 -Dlibrary.jansi.path=/usr/share/apache-maven-3.9.11/lib/jansi-native -Dmaven.multiModuleProjectDirectory=/home/REDACTED/work/janusgraph/janusgraph org.codehaus.plexus.classworlds.launcher.Launcher clean compile -pl janusgraph-cdc -am -DskipTests=true (dns block)
repository.apache.org
- Triggering command: /usr/lib/jvm/temurin-17-jdk-amd64/bin/java --enable-native-access=ALL-UNNAMED -classpath /usr/share/apache-maven-3.9.11/boot/plexus-classworlds-2.9.0.jar -Dclassworlds.conf=/usr/share/apache-maven-3.9.11/bin/m2.conf -Dmaven.home=/usr/share/apache-maven-3.9.11 -Dlibrary.jansi.path=/usr/share/apache-maven-3.9.11/lib/jansi-native -Dmaven.multiModuleProjectDirectory=/home/REDACTED/work/janusgraph/janusgraph org.codehaus.plexus.classworlds.launcher.Launcher clean compile -DskipTests=true (dns block)
- Triggering command: /usr/lib/jvm/temurin-17-jdk-amd64/bin/java --enable-native-access=ALL-UNNAMED -classpath /usr/share/apache-maven-3.9.11/boot/plexus-classworlds-2.9.0.jar -Dclassworlds.conf=/usr/share/apache-maven-3.9.11/bin/m2.conf -Dmaven.home=/usr/share/apache-maven-3.9.11 -Dlibrary.jansi.path=/usr/share/apache-maven-3.9.11/lib/jansi-native -Dmaven.multiModuleProjectDirectory=/home/REDACTED/work/janusgraph/janusgraph org.codehaus.plexus.classworlds.launcher.Launcher clean compile -pl janusgraph-cdc -am -DskipTests=true (dns block)
- Triggering command: /usr/lib/jvm/temurin-17-jdk-amd64/bin/java --enable-native-access=ALL-UNNAMED -classpath /usr/share/apache-maven-3.9.11/boot/plexus-classworlds-2.9.0.jar -Dclassworlds.conf=/usr/share/apache-maven-3.9.11/bin/m2.conf -Dmaven.home=/usr/share/apache-maven-3.9.11 -Dlibrary.jansi.path=/usr/share/apache-maven-3.9.11/lib/jansi-native -Dmaven.multiModuleProjectDirectory=/home/REDACTED/work/janusgraph/janusgraph org.codehaus.plexus.classworlds.launcher.Launcher test -pl janusgraph-cdc -Dtest=CdcMutationEventTest (dns block)
s01.oss.sonatype.org
- Triggering command: /usr/lib/jvm/temurin-17-jdk-amd64/bin/java --enable-native-access=ALL-UNNAMED -classpath /usr/share/apache-maven-3.9.11/boot/plexus-classworlds-2.9.0.jar -Dclassworlds.conf=/usr/share/apache-maven-3.9.11/bin/m2.conf -Dmaven.home=/usr/share/apache-maven-3.9.11 -Dlibrary.jansi.path=/usr/share/apache-maven-3.9.11/lib/jansi-native -Dmaven.multiModuleProjectDirectory=/home/REDACTED/work/janusgraph/janusgraph org.codehaus.plexus.classworlds.launcher.Launcher clean compile -pl janusgraph-cdc -am -DskipTests=true (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Configure Actions setup steps to set up my environment, which run before the firewall is enabled
Add the appropriate URLs or hosts to the custom allowlist in this repository's Copilot coding agent settings (admins only)

Original prompt

This section details on the original issue you should resolve

<issue_title>Support CDC mode for Mixed Index mutations</issue_title>
<issue_description>Mixed Indexes are part of an external system and that's why stale indexes may appear in the system during failures. This can sometimes lead to permanent stale indexes and requires either in-flight index fixing when the index is found or proper transaction recoveries to align data between different storage systems (for example Cassandra and ElasticSearch).
This proposal it to allow using built-in change data capture (CDC) when it's available for mixed index mutations.
In such case, this will guarantee eventual consistency between mixed indexes and actual graph data.

Cassandra natively supports CDC (doc) and there is also available Debezium Connector for Cassandra.

To support this we should change JanusGraph Mixed index mutation mode to skip any mutations during original transaction and instead relay on the Debezium Connector to process Mixed Index mutations later. Alternatively, we could do both (mutate mixed index during original transaction and then trigger mutations again during CDC processing - just to ensure consistency in case of failures).

Related work on transaction recovery process #4872 </issue_description>

<agent_instructions>Add an ability to re-use Cassandra CDC via Debezium Connector to apply mixed index mutation updates following eventual consistency.
Broad idea:

We store data in the storage system (Cassandra in this case).

When Cassandra stores data in the edgestore table it triggers CDC to some Kafka topic. Should use Debezium Connector.

We should run separate "JanusGraph CDC Workers" that read some CDC Kakfa topic and apply mutations to relevant mixed indexes.

We shouldn't push the whole partition to the Kafka Topic, but only the changes in that partition (i.e. separate columns that were affected: ADDED / UPDATED / DELETED).

Implement this solution and add relevant integration tests that check mixed index mutations via CDC (Cassandra + Kafka + ElasticSearch)</agent_instructions>

Comments on the Issue (you are @copilot in this section)

Fixes Support CDC mode for Mixed Index mutations #4873

💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.

linux-foundation-easycla · 2025-10-29T15:00:46Z

❌ - login: @Copilot / name: copilot-swe-agent[bot] . The commit (1e5b1cf, 32930a1, 5b06ebe, ab082a0, ceb1af1) is not authorized under a signed CLA. Please click here to be authorized. For further assistance with EasyCLA, please submit a support request ticket.

Implemented CDC support infrastructure including: - CDC event model and Kafka producer/consumer - CDC-aware index transaction wrapper - Configuration options for CDC modes - Basic unit tests Co-authored-by: porunov <17673243+porunov@users.noreply.github.com>

- Created CdcIndexTransactionFactory for managing CDC lifecycle - Added comprehensive README with usage examples - Implemented integration test with Kafka testcontainer - Added testcontainers junit-jupiter dependency Co-authored-by: porunov <17673243+porunov@users.noreply.github.com>

- Added comprehensive configuration tests - Fixed CdcConfiguration to lazily load config values when enabled - All tests passing successfully Co-authored-by: porunov <17673243+porunov@users.noreply.github.com>

- Remove support for 'cdc_only' format, use only 'cdc-only' - Set mode to null when CDC is disabled for clarity - Update README to clarify supported mode values - All non-Docker tests passing Co-authored-by: porunov <17673243+porunov@users.noreply.github.com>

porunov · 2026-06-28T10:19:12Z

Suppressed by #4906

Keep mixed indexes (ElasticSearch/Solr/Lucene) eventually consistent with the graph by deriving their updates from a Change-Data-Capture stream of the committed graph data, instead of a synchronous second write during the transaction that can diverge on failure and leave a permanently stale index. Pipeline (Apache Cassandra): commit (graph data only) -> Cassandra edgestore(cdc=true) -> Debezium -> Kafka -> CdcIndexUpdateWorker consumer group -> reindex-from-current-state -> ES bulk. Key design points: - Reindex-from-current-state: the worker reads each changed element's current graph state and fully replaces its index document (reusing IndexSerializer, like transaction recovery). This is idempotent and order-independent, so out-of-order or duplicate events still converge to the current state -- no strict ordering required, and a stale event can never overwrite a fresh value. - No dual write: the only synchronous write is to storage; the index is updated downstream from the committed change stream, so it cannot diverge. - Element-keyed Kafka partitioning gives per-element ordering and horizontal scaling via a consumer group; batches are de-duplicated and applied as one ElasticSearch _bulk per index. - At-least-once: offsets are committed only after a batch is durably applied; on failure the batch is reprocessed (rewind) rather than skipped, so the index eventually catches up. Configuration (opt-in, disabled by default): - storage.cql.cdc: emit the Cassandra cdc=true table option on the edgestore table. - index.[X].cdc.enabled / index.[X].cdc.synchronous: per-index dual mode (write synchronously AND via CDC) or cdc-only mode (skip the synchronous write; ES updated solely via CDC). Components: - janusgraph-core: per-index CDC config options, the commit-side skip hook in StandardJanusGraph, and MixedIndexUpdateApplier (the backend-agnostic reindex-from-current-state engine, covering vertex, edge and property-element mixed indexes). - janusgraph-cql: the storage.cql.cdc table option (no Kafka dependency in production code). - janusgraph-cdc (new module; core + kafka-clients): the CdcEventDecoder SPI, DebeziumCassandraJsonDecoder, CdcWorkerConfiguration, CdcIndexUpdateWorker, and the standalone CdcIndexUpdateWorkerMain runner. Testing: 39 tests, including unit/component coverage (decoder vs real serialized bytes, reindex engine over vertex/edge/property-element indexes, worker loop via Kafka MockConsumer, full-chain convergence over Lucene incl. vertex/edge add/update/remove and out-of-order delivery) and two real-container E2Es -- worker -> Kafka -> ElasticSearch, and the full Cassandra-CDC -> Debezium -> Kafka -> ElasticSearch pipeline. The full Debezium E2E is gated behind the cassandra-cdc-e2e Maven profile (auto-activated on Java 17+, required by Debezium 3.x and Testcontainers 2.x); the default Java 8/11 build excludes it and stays green. Docs: advanced-topics/cdc-mixed-index.md operator guide, a 1.2.0 changelog upgrade note, and the regenerated configuration reference. Fixes JanusGraph#4873 Replaces JanusGraph#4874 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Oleksandr Porunov <alexandr.porunov@gmail.com>

Keep mixed indexes (ElasticSearch/Solr/Lucene) eventually consistent with the graph by deriving their updates from a Change-Data-Capture stream of the committed graph data, instead of a synchronous second write during the transaction that can diverge on failure and leave a permanently stale index. Pipeline (Apache Cassandra): commit (graph data only) -> Cassandra edgestore(cdc=true) -> Debezium -> Kafka -> CdcIndexUpdateWorker consumer group -> reindex-from-current-state -> ES bulk. Key design points: - Reindex-from-current-state: the worker reads each changed element's current graph state and fully replaces its index document (reusing IndexSerializer, like transaction recovery). This is idempotent and order-independent, so out-of-order or duplicate events still converge to the current state -- no strict ordering required, and a stale event can never overwrite a fresh value. - No dual write: the only synchronous write is to storage; the index is updated downstream from the committed change stream, so it cannot diverge. - Element-keyed Kafka partitioning gives per-element ordering and horizontal scaling via a consumer group; batches are de-duplicated and applied as one ElasticSearch _bulk per index. - At-least-once: offsets are committed only after a batch is durably applied; on failure the batch is reprocessed (rewind) rather than skipped, so the index eventually catches up. Configuration (opt-in, disabled by default): - storage.cql.cdc: emit the Cassandra cdc=true table option on the edgestore table. - index.[X].cdc.enabled / index.[X].cdc.synchronous: per-index dual mode (write synchronously AND via CDC) or cdc-only mode (skip the synchronous write; ES updated solely via CDC). Components: - janusgraph-core: per-index CDC config options, the commit-side skip hook in StandardJanusGraph, and MixedIndexUpdateApplier (the backend-agnostic reindex-from-current-state engine, covering vertex, edge and property-element mixed indexes). - janusgraph-cql: the storage.cql.cdc table option (no Kafka dependency in production code). - janusgraph-cdc (new module; core + kafka-clients): the CdcEventDecoder SPI, DebeziumCassandraJsonDecoder, CdcWorkerConfiguration, CdcIndexUpdateWorker, and the standalone CdcIndexUpdateWorkerMain runner. Testing: 39 tests, including unit/component coverage (decoder vs real serialized bytes, reindex engine over vertex/edge/property-element indexes, worker loop via Kafka MockConsumer, full-chain convergence over Lucene incl. vertex/edge add/update/remove and out-of-order delivery) and two real-container E2Es -- worker -> Kafka -> ElasticSearch, and the full Cassandra-CDC -> Debezium -> Kafka -> ElasticSearch pipeline. The real-container tests are gated behind the cassandra-cdc-e2e Maven profile (auto-activated on Java 17+, required by Debezium 3.x and Testcontainers 2.x); the default Java 8/11 build excludes them and stays green. CI: a dedicated workflow (.github/workflows/ci-cdc.yml) runs the cdc unit tests on Java 8 and 11 and the full real-container suite -- including the Cassandra-CDC -> Debezium -> Kafka -> ElasticSearch pipeline -- on Java 17 with Docker, so the integration is exercised on every change and guards against regression. Docs: advanced-topics/cdc-mixed-index.md operator guide, a 1.2.0 changelog upgrade note, and the regenerated configuration reference. Fixes JanusGraph#4873 Replaces JanusGraph#4874 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Oleksandr Porunov <alexandr.porunov@gmail.com>

Initial plan

5b06ebe

Copilot AI assigned Copilot and porunov Oct 29, 2025

Copilot started work on behalf of porunov October 29, 2025 15:01 View session

Copilot AI and others added 4 commits October 29, 2025 15:30

Add CDC configuration tests and fix lazy initialization

ab082a0

- Added comprehensive configuration tests - Fixed CdcConfiguration to lazily load config values when enabled - All tests passing successfully Co-authored-by: porunov <17673243+porunov@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Support CDC mode for mixed index mutations~~ Add CDC support for mixed index mutations via Kafka Oct 29, 2025

Copilot AI requested a review from porunov October 29, 2025 15:51

Copilot finished work on behalf of porunov October 29, 2025 15:51

porunov mentioned this pull request Jun 28, 2026

feat: CDC-based mixed index synchronization (#4873) #4906

Draft

9 tasks

porunov closed this Jun 28, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Add CDC support for mixed index mutations via Kafka#4874

Add CDC support for mixed index mutations via Kafka#4874
Copilot wants to merge 5 commits into
masterfrom
copilot/support-cdc-mode-mixed-indexes

Copilot AI commented Oct 29, 2025 •

edited

Loading

Uh oh!

linux-foundation-easycla Bot commented Oct 29, 2025 •

edited

Loading

Uh oh!

porunov commented Jun 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Uh oh!

Conversation

Copilot AI commented Oct 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Architecture

CDC Modes

Configuration

Usage

Future Work

I tried to connect to the following addresses, but was blocked by firewall rules:

Comments on the Issue (you are @copilot in this section)

Uh oh!

linux-foundation-easycla Bot commented Oct 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

porunov commented Jun 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Copilot AI commented Oct 29, 2025 •

edited

Loading

linux-foundation-easycla Bot commented Oct 29, 2025 •

edited

Loading